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The ringdown phase following a binary black hole merger is usually assumed to be well described by 
a linear superposition of complex exponentials (quasinormal modes). In the strong-field conditions 
typical of a binary black hole merger, non-linear effects may produce mode coupling. Artificial mode 
coupling can also be induced by the black hole's rotation, if the radiation field is expanded in terms 
of spin- weighted spherical harmonics (rather than spin- weighted spheroidal harmonics). Observing 
deviations from the predictions of linear black hole perturbation theory requires optimal fitting 
^ ' techniques to extract ringdown parameters from numerical waveforms, which are inevitably affected 

by numerical error. So far, non-linear least-squares fitting methods have been used as the standard 
workhorse to extract frequencies from ringdown waveforms. These methods are known not to be 
optimal for estimating parameters of complex exponentials. Furthermore, different fitting methods 
have different performance in the presence of noise. The main purpose of this paper is to introduce 
the gravitational wave community to modern variations of a linear parameter estimation technique 
first devised in 1795 by Prony: the Kumaresan- Tufts and matrix pencil methods. Using "test" 
| damped sinusoidal signals in Gaussian white noise we illustrate the advantages of these methods, 

showing that they have variance and bias at least comparable to standard non-linear least-squares 
techniques. Then we compare the performance of different methods on unequal-mass binary black 
hole merger waveforms. The methods we discuss should be useful both theoretically (to monitor 
errors and search for non-linearities in numerical relativity simulations) and experimentally (for 
parameter estimation from ringdown signals after a gravitational wave detection). 
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I. INTRODUCTION 



The last year witnessed a remarkable breakthrough in numerical relativity. Many different groups were finally able 
to evolve black hole binaries through the last few cycles of inspiral, merger and ringdown and to extract gravitational 
waveforms from the evolutions [l[ [H, [H, 0, [f| d, 0, H, H, Hoj . If numerical simulations start out at large enough 
separation, it should be possible to match them with Post-Newtonian predictions for the inspiral signal. This is 
a very active research area [TJ EH EH- Studies of equal-mass merger simulations starting from different orbital 
separations (or produced by different numerical techniques) show that, independently of the waveform accuracy in 
the pre-merger phase, the strong-field waveform has a "universal" shape. Preliminary, "first order" explorations of 
the merger waveform for equal-mass black hole binaries indicate that the merger phase is short-lived, lasting only 
~ 0.5 — 0.75 gravitational wave (GW) cycles |12|. After this short merger the signal has the typical "ringdown" shape 
- i.e., it is well modelled by a superposition of complex exponentials, known as quasinormal modes (QNMs). 

QNMs play a role in any astrophysical process involving stars and black holes. Oscillations of a relativistic star 
and of a black hole necessarily produce GWs [3, EE EH ■ In linear perturbation theory the metric perturbations can 
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usually be described by a single scalar function, which, in the ringdown phase, can be written as a superposition of 
complex exponentials: 

*, m (t) = ]T A lmn e^ +iu ^ t+i ^ . (1.1) 

Here Ai mn , fimni ^imn and Ti mn = — ocT are the mode's amplitude, phase, frequency and damping time respectively. 
The indices {I , m) describe the angular dependence of the signal, and the index n sorts the modes by the magnitude 
of their damping time. The amplitude and phase is determined by the specific process exciting the oscillations. 
Remarkably, in linear perturbation theory the QNM frequencies w; mn and damping times T; mn are uniquely determined 
by the mass and angular momentum of the black hole: this is a consequence of the so-called "no hair" theorem of 
general relativity. For this reason, accurate QNM measurements could provide the "smoking gun" for black holes and 
an important test of general relativity in the strong- field regime (see eg. fl7l ]. listing the dominant QNM frequencies 
for rotating black holes and providing fits of the numerical results) . 

In the GW literature it is often claimed that "the ringdown phase is well described by (linear) black hole perturbation 
theory, and that the ringdown waveform is given by a simple superposition of damped exponentials" of the form (|1.1|> . 
This statement is based on two tacit assumptions, that may well be false under the strong-field conditions typical of 
a binary black hole merger: 

(i) Non-linearities are negligible, so that linear perturbation theory applies. However, non-linear effects should be 
present in binary black hole merger waveforms. Model problems show that non-linearities are responsible for a 
systematic shift in the quasinormal frequencies and, more interestingly, for mode coupling (l8j . 

(ii) Mode coupling is negligible. Even if assumption (i) above is valid and non-linear effects are small, different 
multipolar components can still be coupled. For example, artificial mode coupling can arise because of some 
commonly used approximations in the wave extraction process 1 . 

An important open problem in the interpretation of merger waveforms is to isolate general relativistic non-linearities 
from artificial effects induced by unwanted features of the numerical simulations, such as finite-differencing errors, 
errors due to the approximate nature of initial data or spurious rotational QNM coupling induced by the use of 
spin-weighted spherical harmonics in GW extraction. All of these effects depend on the details of the numerical 
implementation and of the wave extraction procedure. The ultimate goal of GW detection is to assess the validity of 
general relativity in the strong-field regime, so the relevance of this problem can hardly be underestimated. Under- 
standing non-linear phenomena unique to general relativity is of paramount importance if we want to discriminate 
Einstein's theory from alternative theories of gravity, that may give different predictions in strong-field situations. 
Since most of the strong-field waveform can be described as a QNM superposition, strong-field effects could well show 
up in the fine structure of this QNM superposition: in particular, non-linearities may manifest themselves as time 
variations of the ringdown frequencies, or as beating phenomena between different QNMs. 

In this paper we take a first step towards the solution of this problem. We consider intrinsic limitations of the 
fitting routines used to extract quasinormal ringing parameters, exploring the properties of different fitting techniques 
for the extraction of ringdown parameters from numerical waveforms. This issue can be studied independently of the 
details of any given merger simulation. The idea is that, by quantifying the limitations of a given fitting method, we 
should be able to isolate systematic uncertainties (due to the fact that we are fitting noisy numerical waveforms) from 
possibly more interesting physical features of the waveforms themselves. 

The problem of estimating the parameters of complex exponentials in noise has a long history in science and 
engineering. Damped sinusoidal signals are the "real-world counterpart" of the harmonic oscillators used in Fourier 
analysis. Therefore, estimating the parameters of damped sinusoids is of primary importance in physics. Examples 
include (i) speech and audio modelling [2l| , (ii) angle-of-arrival estimation of plane waves impinging on sensor arrays 
for radar purposes or radio-astronomy, and antennae array processing [22[ , (in) estimation of lifetimes and intensities 
in radioactive decay [23| , (iv) nuclear magnetic resonance and computer assisted medical diagnosis [24] . This list can 



1 Spin-weighted spherical harmonics are often used to separate the angular dependence of the Weyl scalars and to read off the (I , m) 
multipolar components of the emitted radiation, but the Kerr metric is not spherically symmetric. More correctly, one should use 
spin- weighted spheroidal harmonics [l9| . Spin- weighted spherical harmonics can be expressed as linear superpositions of spin- weighted 
spheroidal harmonics, and this leads to some artificial mode mixing in the extracted waveforms Il2t l2dH . By "artificial" we mean that 
the effect - in this case, mode coupling — is due to approximations we introduce in the simulations, rather than being produced by a 
physical agent (such as non-linearities). 
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be extended to include spectroscopy, the study of mechanical vibrations (including seismic signal processing), the 
diffusion of chemical compounds, economics, and so on. 

Fits of ringdown waveforms are usually performed using standard non-linear least-squares methods (see eg. fl2l I20I ] 
and references therein). These methods are known to fail in estimating parameters for a sum of damped exponentials. 
In this case, minimizing the squared error over the data requires the solution of highly non-linear expressions in terms 
of sums of powers of the damping coefficient. No analytic solution is available, and these expressions can usually be 
solved only with a good initial guess for the parameters [25j ]. Even for a single damped exponential, if the initial 
guess is inaccurate the algorithm often fails to converge. Iterative algorithms, such as gradient descent procedures 
or Newton's method, have been devised to minimize these non-linear expressions. Unfortunately the algorithms are 
computationally very expensive, sometimes requiring at each step the inversion of matrices of dimension as large as 
the number of data samples [26]. Furthermore, gradient descent algorithms for multimodal equations sometimes fail 
to converge to the global minimum. 

Computational difficulties with non-linear least-squares methods led to the development of suboptimal estimation 
methods based on linear prediction equations [lil, HH . These methods are modern variations on a 1795 paper by 
Gaspard Riche, Baron de Prony [27j |. who introduced a procedure to exactly fit N data points by as many purely 
damped exponentials as needed. Modern versions of Prony's method generalize the original idea to damped sinusoidal 
models. They also make use of least-squares analysis to approximately fit an exponential model for cases where the 
data points cannot be fitted by the assumed number of exponential terms. Unfortunately, these "modified least- 
squares Prony methods" are very sensitive to numerical noise. Two successful improvements of these methods, the 
Kumaresan- Tufts (KT) [28| and matrix pencil (MP) [29[ techniques, make use of singular value decomposition to 
improve parameter estimation accuracy in the presence of noise. For excellent reviews of these and other estimation 
methods we refer the reader to the first chapter of [24] (in French) and to Marple's book (25|. A purpose of this paper 
is to introduce these estimation methods to the GW community. 

The plan of the paper is as follows. To put our problem in perspective, in Section [TT] we describe the main features 
of our numerical simulations of binary black hole mergers and of the resulting waveforms. In Section lim we summarize 
the theory behind different estimation methods, and in Section [IVI we list the numerical algorithms we implemented 
in our comparisons. In Section [V] we compare the performance of linear estimation methods (MP and KT) against 
non-linear least squares methods, by performing Monte Carlo simulations of damped sinusoidal signals in Gaussian 
white noise. In Section fVTl we apply different fitting methods to selected waveforms generated by numerical relativity 
simulations. Finally we summarize our results and indicate possible directions for further research. 



II. NUMERICAL SIMULATIONS 



In this Section we briefly describe the numerical simulations of non-spinning, unequal-mass black hole binaries 
previously presented in Q. The simulations were performed with the Bam code [§[ , using the so-called "moving 
puncture" method. This method, originally introduced in Refs. [3, 0], is now being used almost routinely by various 
groups to successfully perform numerical simulations of inspiralling and merging black hole binaries. Our simulations 
of the inspiral of unequal- mass black hole binaries are the same used to study recoil in Ref. @. Details of these 
simulations, of the numerical setup and of the Bam code are given in 

Refs. tali- The main purpose of this paper 
is to test fitting methods to extract QNMs from the simulations. Therefore we defer a detailed investigation of the 
unequal-mass waveforms and of their physical properties to a forthcoming publication (30j . 

In the absence of spin, a standard conformally flat black hole binary initial data set is uniquely determined by the 
parameters mi, m.2, Pi, P%<, D which denote, in this order, the bare masses and linear momenta of the individual 
holes, and their coordinate separation. The specification of these parameters allows us to compute the total Arnowitt- 
Deser-Misner (ADM) mass M of the system, the orbital angular momentum L = (Pi + P2)D, the individual black 
hole masses Mi, Mi and the mass ratio q = M\jMi. Finally, the sum of the individual holes' spins and of the orbital 
angular momentum gives the total ADM angular momentum J of the system. 

In the parameter study under consideration, the mass ratio q was varied while keeping constant the initial coordinate 
separation D = 7 M of the black holes. The linear momenta of the individual holes then follow from the requirement 
that the system represent a quasi-circular configuration. For a given separation D, the momentum of each puncture 
can thus be calculated to third-order post-Newtonian accuracy in the Arnowitt-Deser-Misner, transverse-traceless 
(ADMTT) gauge. The resulting relation is given by Eq. (64) in |8j and forms the basis for all initial data sets 
discussed in this work. The parameters used in our simulations are summarized in Table HI 

In the notation of Ref. [8| , the simulations have been performed using the Xv=2 approach of the moving puncture 
method with a number of grid points i — 56, 64, 72 on the three innermost refinement levels, respectively. This 
corresponds to resolutions of h = 1/45, 1/51 and 1/58. In this work we use the waveforms resulting from the low- 
resolution and high-resolution simulations, using i = 56 and i = 72 grid points, respectively. The wave extraction 
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TABLE I: Total ADM mass, angular momentum and initial binding energy of the unequal mass binaries. 

q M J/M 2 —E b /M 
1.00 0.9935 0.8845 0.0140 
1.49 1.2422 0.8494 0.0134 
1.99 1.4914 0.7870 0.0124 
2.48 1.7408 0.7232 0.0114 
2.97 1.9904 0.6649 0.0104 
3.46 2.2401 0.6132 0.0096 
3.95 2.4899 0.5679 0.0089 

procedure is based on the Newman-Penrose formalism. The Weyl scalar \E"4 is extracted according to the method 
described in Sec. Ill A of [8|. In all simulations presented in this work, <3>4 is calculated at extraction radius r cx = 30M . 
Further details of the numerical setup are given in Ref. 0] . 




FIG. 1: Real and imaginary parts of rMipi m for the two dominant multipoles. The binary has mass ratio q — 1.5. The burst of 
radiation at early times is induced by the initial data. After the initial burst, the real and imaginary parts are simply related 
by a phase shift (i.e., the waveform is circularly polarized). The irregular behavior at late times is due to numerical noise. 

To illustrate some issues related with the extraction of ringdown information from numerical waveforms, in Fig. [1] 
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FIG. 2: \rMipim\ for different multipoles. The high-resolution I — m — 2, I — m — 3 and I = m = 4 wave amplitudes have 
maxima at t pC ak/M = 236.4, 238.4 and 235.7, respectively. Wiggles at late times are due to numerical noise, mainly caused 
by reflections from the boundaries. 



and Fig. [2] we show the complex mode amplitudes 2 rMipi m and their modulus \rMipi m \ for the dominant components 
of the radiation emitted by a binary with mass ratio q = 1.5 3 . Because of the reflection symmetry of the system (see 
[H), real and imaginary parts of the positive-m and negative-m components are related by 

^-m = (-l) l (Vlm)*. (2.1) 

Components with \m\ < I or I > 4 usually contain significant numerical noise, and we will ignore them in the following. 

At early times the waveform is contaminated by a spurious burst of radiation, due to the approximate nature of the 
initial data. After this initial data burst, the frequency and amplitude of the wave grow as the binary members come 
closer, producing the characteristic "chirping" gravitational waveform. Eventually the binary members merge, the 
wave amplitude reaches a maximum, and then it decays as the remnant black hole settles down to a stationary Kerr 
state, emitting ringdown waves. The final part of the ringdown waveform is visibly contaminated by some amount of 
numerical noise, that decreases as we increase the resolution. This noise is mainly due to radiation being reflected 
from the boundaries of the computational domain. 

The plots clearly show that the real and imaginary parts of the waveform follow the same pattern, except for a 
(roughly) constant phase shift. This means that the waveform is circularly polarized [3l| . From the point of view 
of extracting information from the ringdown, circular polarization means that fitting the real or the imaginary part 
should make no difference: to a good approximation, we should get the same results for the oscillation frequencies 
and damping times. Fitting methods capable of directly dealing with complex waveforms (such as the Prony-type 
methods considered in this paper) should be particularly useful for waveforms with general polarization, such as those 
that should be emitted by the generic merger of spinning, precessing black holes. 

Independently of the chosen fitting method, there is some arbitrariness in the choice of the time window [to, tf] 
used to perform the fit. A well-known problem with the transition merger-ringdown is that we do not know a priori 
when the ringdown starts (see [T3, H(| H3 and references therein) . Ideally, the starting time to should be determined 
by a compromise between the following requirements: (i) to should be small enough to include the largest possible 
number of data points: in particular, we do not want to miss the large amplitude, strong-field part of the waveform 
right after the merger; (ii) to should be large enough that we do not include parts of the waveform which are not well 



2 These amplitudes are projections of the Weyl scalar <I/4 onto spin-weighted spherical harmonics of spin-weight s = —2. For the definition 
see, eg., Eq. (39) of Q, where the amplitudes are denoted by A; m . Here we choose a different notation, to avoid confusion with the 
QNM amplitudes Ai mn . 

3 Here and in the following we label the Bam runs by the corresponding value of q, rounded to the first decimal digit (see Table H). 
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described by a superposition of complex exponentials: the inclusion of inspiral and merger in the ringdown waveform 
would produce a bias in the QNM frequencies. 

A judicious choice of tf is also necessary. Usually we would like the time window to be as large as possible, but a 
glance at Fig. Q] and Fig. [5] shows that the low amplitude, late-time signal is usually dominated by numerical noise, 
mainly caused by reflections from the boundaries. This noise can reduce the quality of the fit, especially for the 
subdominant components with I > 2 and for large values of to- A practical criterion for the choice of tf is suggested 
by a look at Fig.O If the ringdown waveform were not affected by noise from boundary reflections, \rMipi m \ should 
decay linearly on the logarithmic scale of the plots 4 . At low signal amplitudes, we see boundary noise- induced wiggles 
superimposed on this linear decay: the first occurence of these wiggles is a good indicator of the time tf at which 
numerical results cannot be trusted anymore. To test the robustness of fitting results to late-time numerical noise, 
while at the same time keeping the largest number of data points in the waveform, we decided to use two different 
"cutoff criteria" : 

1) "Relative" cutoff: remove from the waveforms all data for times t > tf = t rc i, where i rc i is the time when 
the amplitude of each multipolar component \rMipi m \ becomes less than some factor '(/'cutoff times the peak 
amplitude (at i pca k ~ 240 M for the waveforms in Fig. [2]) : 



\rM^ lm (t iel )\ 

< ^cutoff • (2.2) 



\rMip lm (t peak )\ 

2) "Absolute" cutoff: remove from the fit all data with t > tf = i a bs, where i aDS is the time at which the absolute 
value of the amplitude \rMipi m \ < V'cutoff/lO. 

The choice of the cutoff amplitude is somewhat arbitrary. We chose V'cutoff = 10 -3 for low resolution, and V'cutoff = 
10 -4 for high resolution. 

For each chosen tf, we will look at the performance of the fitting routines as we let to vary in the range [i poa k, tf]. 
We do this for two reasons. The first reason is physical: by monitoring the convergence of the QNM frequencies to 
some "asymptotic" value as to — > oo, we can tell if the black hole settles down to a stationary Kerr state, or if, on 
the contrary, non-linearities and mode coupling are always present. The second reason is related with the main goal 
of this paper, which is to assess the performance of our fitting routines: as to grows the signal amplitude decreases 
exponentially, and we effectively reduce the signal-to-noise ratio (SNR) in our fitting window. Robust fitting methods 
should give reasonable results even for large values of to (that is, modest values of the SNR). 



III. A BRIEF SURVEY OF ESTIMATION METHODS FOR DAMPED SINUSOIDS 



In this Section we present a brief summary of the theory behind different estimation methods for damped sinusoidal 
signals. We consider only methods that make direct use of the data, by which we mean that no use is being made of 
the autocorrelation function. The measured signal a; is a linear superposition of the "true" waveform, 'J, and noise, 
w. We suppose we have N samples of the signal, equally spaced in time with time sampling interval T, and we label 
each sample by an integer n: 

x [n] = &[n] + w[n] , n = l,...,N. (3.1) 

For simplicity we assume the noise w[n] to be white and Gaussian, with standard deviation a and mean fi = 0. We are 
interested in the ringdown waveform '5 [n] , that we write as a superposition of p complex exponentials with arbitrary 
amplitudes and phases: 

p 

y{n] = V A k e ( - ak+iuJk){n - 1)T+iVk . (3.2) 

k=l 



4 With larger resolution and longer running times, eventually the exponential decay should turn into the well known power-law tail induced 
by backscattering of the radiation off the spacetime curvature [33j] . In the simulations we consider, noise produced by boundary effects 
is large enough that this effect is not visible. 
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It is useful to re-cast Eq. (|3.2p in the slightly different form 

fe=i 

with 

h k = A k e^, (3.4) 
z k = e (^+^) T . (3.5) 

Now the unknown complex parameters are {h k , z k } (k = 1, . . . ,p), and possibly the number p of damped sinusoids. 

A. Non-linear least-squares 

A popular estimation method is non-linear least-squares, which consists in minimizing the integrated squared error 

N 

p = J2\w[n}\ 2 , (3.6) 

n=l 

with w[n] = x[n] — ty[n]. The method is very general, in the sense that (in principle) it can be applied to any model 
function ty[n]. For ringdown waveforms of the form (|3.3|) we must minimize over the {h k , z k } parameter space (and 
possibly the p— space, if the number of damped exponentials is not known a priori) in an essentially non-trivial way. 
The procedure is computationally expensive and not always accurate: the solution may converge to a local (rather 
than a global) minimum of the integrated squared error. Experiments show that non-linear least-squares techniques 
perform badly in estimating the parameters of damped exponentials. The situation gets even worse in the presence 
of noise [24l |25| . In our numerical work, to minimize the integrated squared error we used the well-known Levenberg- 
Marquardt algorithm (34| as implemented in the Fortran subroutine llmdifl which is part of the MINPACK library for 
solving systems of non-linear equations [35| 5 . 



B. Prony method 

Computational difficulties with non-linear least-squares methods led to the development of suboptimal estimation 
methods, especially designed to deal with damped sinusoidal signals. The prototype of these estimation techniques is 
the Prony method, which is essentially a trick to reduce the non-linear minimization problem to a linear prediction 
problem. The method has been successfully tested in different branches of data analysis and signal processing. Some 
variants of the basic idea improve the variance and bias of parameter estimation in the presence of noise: we briefly 
discuss these variants in the following Sections. Our introduction to standard Prony methods parallels those by 
Marple [25[ and Djermoune [24| , We refer the reader to those references for further details. 

To start with, let us assume there is no noise in our data set. Let us also assume that there are as many data 
samples as there are complex exponential parameters (hi, . . . , hp), (zi, . . . , z p ): N = 2p. We are then fitting x[n] to 
an exponential model, 



For 1 < n < p we can write this in matrix form as 

/ z°i z% ••• 

1 1 



fe=i 



(3.7) 





(hl\ 




( x[l] \ 








x[2] 


) 






\ x]p] J 



(3.8) 



5 The same algorithm is also implemented in Mathematica |36l| 
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If we can determine the z k s by some other procedure, then (|3.8[) is a set of linear equations for the complex amplitudes 
h k . Prony 's method is in essence a trick to determine the z^'s without the need for non- linear minimizations, as follows. 
Let us define a polynomial A(z) of degree p which has the z^'s as its roots: 



fe=l m=0 

Let us also normalize the complex coefficients a[m] so that a[0] = 1. By using Eq. 

p 

i[m] ^2 h k i 



we can write 



a\m\x\n — m \ — a\ 



~n— m— 1 



fc=l 



Summing this last equation from m — to m — p we get, for p + 1 < n < 2p 

p p p 



a[m]x[n — m] = hiZ™ p a[m]zf m 1 = . 



(3.9) 



(3.10) 



(3.11) 



m=0 



m—0 



where the last equality follows from (|3.9p . This is a (forward) linear prediction equation. In matrix form it can be 
expressed as 



/ x[p] x[p — 1] • • • x[l] \ 
x[p + 1] x[p] ■ ■ ■ x[2] 



\x[2p-l] x[2p-2] ••• x[p] J \ a[p] J 



(a[l]\ fx[p+l}\ 
o[2] x[p+2] 

V x[2p] J 



(3.12) 



This equation is the basic result of the Prony method: it shows that we can determine the a[fc]'s from the data, 
decoupling the problem of determining the hk and Zk parameters. 

The original Prony procedure works as follows. First, given the data, solve (|3 . 1 2[) for the coefficients a[m). Then 
determine the roots Zk of the polynomial (|3.9|) . The damping and frequency are obtained by inverting (13. 5| : 

(3.13a) 
(3.13b) 



a k = log \z k \/T , 
uj k = tan -1 [Im(z fe )/Re(zfe)] /T . 

Finally, solve ()3.8j) for the complex quantities h k and invert (|3.4| to find amplitudes and phases 

A k - \h k \, 

ip k = tan -1 [lm(h k ) /Re(h k )} . 



(3.14a) 
(3.14b) 



It should now be clear that the Prony algorithm reduces the non-linear fitting problem to the trivial, computationally 
inexpensive numerical tasks of (i) solving linear systems of equations, and (ii) finding the roots of a polynomial. 



C. Modified least-squares Prony 

For most situations of interest, there are more data points than there are exponential parameters: N > 2p. In this 
case, Eq. (|3.12p is modified to 



:[p + 1] x[p 



We can write this as a matrix equation, 



which can be solved in the least-squares sense 



/ x[p] x[p-l] ••• x[l] \ / a[l] \ 



2] 





( o[l] \ 




o[2] 


) 





Xa = — x . 



a=-(X w X) _± X^x, 



( 4p + 1] \ 

x[p + 2} 
V J 



(3.15) 



(3.16) 
(3.17) 
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where a superscript H denotes Hermitian transpose: X ff = (X T )*. The important difference is that we must now 
minimize the linear prediction squared error, rather than the exponential approximation squared error of Eq. (|3.6[) . 
Once we have determined the a[fc]'s, the rest of the Prony method carries on in the same manner. Details of the 
implementation of this "least-squares Prony method" (which is sometimes called "extended Prony method" ) can be 
found in [251 ] . 



D. Kumaresan- Tufts 



Unfortunately, the original and least-squares Prony methods only work well in the absence of noise, or for large 
SNR. For small SNR, the estimation of the frequencies and damping times has large variance and bias [25|, H3] ■ 

In the previous Section we explained how to estimate the exponential parameters by introducing the characteristic 
polynomial A(z), which has roots at Zk = e Sk = e^ ak+luJk " >T . The coefficients a[k] of A(z) are solutions of the forward 
linear prediction equation (|3. 1 1|) . These same exponentials can be generated in reverse time by the backward linear 
predictor [25| 



b[m]x[n - p + m] = 0. (3.18) 

m=0 

in which b[0] = 1. The characteristic polynomial 

p 

B(z) = b*[m]z p - m , (3.19) 

771=0 

has roots at Zk = e~ Sfc . For damped sinusoids (Re[«fc] < 0) it can be shown that the roots of the forward linear predic- 
tion polynomial A(z) lie inside the unit z— plane circle, whereas those of the backward linear prediction polynomial 
B(z) lie outside the unit z— plane circle [25l. |28| . 

Suppose now we superimpose to the signal complex additive Gaussian white noise. Noise causes a bias in the 
estimates of the true zeros of the polynomials, which translates into a bias in the estimates of frequencies and 
damping times. It was empirically observed that the bias may be significantly reduced by looking for a number of 
exponential components L > p, where p is the actual number of exponentials present in the signal [281 ] . L is called the 
prediction order of the model. Of course, the selection of a higher prediction order introduces additional zeroes due 
to noise, but these can be statistically separated by examining the zeroes of A(z) and B(z). For both polynomials, 
zeroes due to the noise tend to stay within the unit circle, whereas the true zeroes due to the exponential signal form 
complex-conjugate pairs inside and outside the unit circle. This is basically due to the fact that the statistics of a 
stationary random process do not change under time reversal. Using singular value decomposition (SVD) can provide 
further improvement [2J, [25|, [28j] ■ Write X in (|3.16[) as 

X = XJSV H , (3.20) 

where S is a (N — L) x L dimensional matrix with the singular values on the diagonal (si , ... , s p , s p+ i , ... , sl) arranged 
in decreasing order. Noise can be reduced by considering the reduced rank approximation 

X = USV ff (3.21) 



with 

T <s n 1 

(3.22) 



s P 





(N-L)xL 

Here S p is the top-left p x p minor of S. An estimate for the coefficients a[k] is then 

a= X+x, (3.23) 

where X + is the pseudo-inverse of X. It was found [28| that the use of the truncated SVD greatly enhances the SNR, 
providing a better estimate of the vector a, and consequently of the exponential parameters. This is the basic idea 
underlying the Kumaresan- Tufts method. More details can be found in the original work [28| (see also @, For 
improvements of this method using total least-squares, see for instance [381 ] . 
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The frequency variance depends on the prediction order L. For one undamped exponential (a = 0) of amplitude h± 
it can be shown [2^, [3t| that the variance is given by 

™<"> = HN n lmLi) W for L - N/2 - (3 - 24a) 

Minima are attained for L ~ 7V/3 (L < iV/2) and for L ~ 2iV/3 (L > N/2), these relations being exact in the limit 
iV — > oo. Correspondingly, the optimal frequency variance is 

27 er 2 

VarH "4]vW (3 ' 25) 
which is only slightly larger than the Cramer-Rao bound: 

the ratio being 9/8 = 1.125 in the limit N — > oo. For one damped exponential, closed-form expressions for the variance 
and bias of the damping time in the large SNR limit can be found in [13, EH . 



E. Matrix Pencil 



An alternative to estimate exponential parameters from noisy signals is the matrix pencil (MP) method [2£|. The 
MP method is in general more robust than the KT method, having a lower variance on the estimated parameters, 
but a slightly larger bias 0, SH ■ For a detailed description of this method we refer the reader to [29[ . The following 



brief summary follows quite closely the treatment in [2 
/ 

X = 

\x[N -L-l] x[N — L] ■■■ x[N -2] J 



x[0] 
x[l] 



x[l] 
x[2] 



x[L - 1] \ 

x\L] 



Let Xo and Xi be two matrices defined as 
/ 



Xi 



x[l] 
x[2] 



x[2] 
x{3] 



x[L] \ 
x[L + l] 



(3.27) 



\ x[N -L] x[N — L] ■■■ x[N - 1] / 



Here L is the pencil parameter: it plays the role of the prediction order parameter in the KT method. One can 
decompose Xq and Xi as 



Xo — Z;HZ r 

Xi = Z/HZZ r 



where 



'A = 



1 



1 

Z-2 



\* 



N-L-l N-L-l 



..N-L-l 
Z P 



/I «1 

1 Z2 

V 1 z P 



y L-l 



(3.28) 
(3.29) 



(3.30) 



H 

Z 



diag(/ii , h 2 
diag(zi , z 2 ■ 



■ hp) ■ 



Consider now the matrix pencil 6 Xi — zXo. We can write this as 

Xi — zXq = Z;H (Z — zip) Z,. 



(3.31) 
(3.32) 



(3.33) 



Let A and B be two n X n matrices. The set of all matrices of the form A — XB, with A complex, is called a matrix pencil. 
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When z ^ Zi, the matrix Z — zl p is of rank p. However, for z = z% it is of rank p — 1. Therefore the poles of the 
signal reduce the rank of the matrix pencil for p < L < N — p. This is equivalent to saying that the poles Zj are the 
generalized eigenvalues of (Xi ,Xo), in the sense that (Xi — zXo)v = 0, with v an eigenvector of Xi — zXo. To find 
the poles Zi one can use the fact that X^Xi has p eigenvalues equal to the poles Zi, and L — p null eigenvalues [24j . 
Here a dagger denotes the (Moore-Penrose) pseudo-inverse. 

In practice we do not have access to the noiseless signal, therefore we must work directly with the noisy data. 
SVD is again necessary to select the singular values due to the signal. The basic steps of the MP method can be 
summarized as follows: 

(i) Build the matrices Yo and Yi as in (|3.27j) : 

(ii) Make a SVD of Yf. Y x = USV T . 

(iii) Estimate the signal subspace of Yi by considering the p largest singular values of S: Yi = XJ p S p Vp , where U p 
and V p are built from the first p columns of U and V, and S p is the top-left p x p minor of S. 

(iv) The matrix Z L = Y+ Y = V p Sp 1 UjY has p eigenvalues which provide estimates of the inverse poles \/zi, the 
other L — p eigenvalues are zero. Since has only p non-zero eigenvalues, it is convenient to restrict attention 
to a p x p matrix Z p = S~ 1 UjY V p [29| . 

The MP technique exploits the matrix pencil structure of the underlying signal, rather than the prediction equations 
satisfied by it. Nevertheless, there are strong similarities between the MP and KT methods. An extensive comparison 
of their performance and theoretical properties can be found in [24|, H^, Sl[ . 

As for the KT method, the variance of parameter estimation depends on L. For one undamped exponential of 
amplitude hi it can be shown (29| that the variance is given by 

var H = J^J^^ ^ L < N/2 , (3.34a) 

var H = j^j^^ ^ L > N/2 , (3.34b) 

i.e. it is always lower than the corresponding variance for the KT method, as given in Eq. (|3.24[) . Choices minimizing 
the frequency variance are L = N/3 for L < N/2, and L = 2N/3 for L > N/2, and the optimal frequency variance is 

27 er 2 

VarH= 4AW' (3 ' 35) 

as in the KT method. For one damped exponential, closed-form expressions for the variance and bias of the damping 
time in the large SNR limit have recently been derived and compared with Monte Carlo simulations [4l| ■ The results 
indicate that the MP method performs better (has smaller variance) than the KT method. For both methods the 
frequency estimate is unbiased, but the estimate of the damping time is biased. The bias is slightly larger for the MP 
method, the difference between the two methods becoming larger for small SNR. 



IV. FITTING ALGORITHMS CONSIDERED IN THIS PAPER 



In the rest of this paper we will compare the performance of the four fitting algorithms described in the previous 
Section: Levenberg-Marquardt (LM), modified least-squares Prony, Kumaresan- Tufts (KT) and matrix pencil (MP). 
For quick reference, in Table |H] we list the original papers, along with web resources and publications providing 
software implementations of each algorithm (in Fortran, MATLAB or Mathematica) . Below we give some details on 
our own practical implementation of the algorithms. 



A. Levenberg-Marquardt 

The LM algorithm is more general than the other methods we consider in this paper, in the sense that fitting 
functions are not restricted to a simple sum of complex exponentials. A problem if we want to fit merger waveforms 
is that LM (like most non-linear least-squares methods) is designed to deal with real functions. For this reason the 
authors of who studied waveforms similar to those in Fig. [1] fitted only the real (or imaginary) part of the 
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TABLE II: Fitting methods used in this paper. El-Hadi Djermoune and Stanley Lawrence Marple Jr. kindly sent us up-to- 
date implementations of the codes described in [24[ and [25f | . We are also grateful to Gordon Smyth for providing us with a 
MATLAB routine to estimate parameters for purely damped exponentials, and to Boaz Porat for Mathematica packages to 
estimate parameters of undamped sinusoids using several methods (KT, maximum likelihood, Yule- Walker...): see [42| . 

Method Reference Software 

Levenberg-Marquardt (LM) [34] [35, 36] 

Modified least-squares Prony [25] [25, 43, 44, 45] 
Kumaresan- Tufts (KT) " [28] ~ [24, 46] ~ 
Matrix pencil (MP) [29] [24] 

signal. Some practice also shows that general non-linear least-squares methods (such as the LM algorithm) often 
fail to converge if the signal is given by a superposition of damped sinusoids, unless we provide very accurate initial 
guesses for the fitting parameters. 

For simplicity, and to take into account these limitations of non- linear least-squares algorithms, in Section fVl we 
will compare the performance of different routines on a real ringdown signal with a single frequency and damping 
time. We choose some starting time to and we fit the real signal by a four-parameter function: 

*(i) = Ae a{t - to) sm(iut + ip) , (4.1) 

The LM algorithm provides reasonable answers only if the four parameters (A, ip, uj, a) are reasonably close to their 
"true" values. In particular, an accurate initial guess for A is necessary to avoid "hard failures" of the fitting routine: 
we choose as initial guess the value of the signal amplitude at t$. We fix the tolerance parameter in the lmdif routine 
to be T0L= 10~ 12 (but this choice is not crucial). 

As a final remark we point out that, by using a real signal, in a sense we are "biasing" our tests in favour of the 
LM method. The reason is that fitting a real signal requires the inclusion of at least two complex exponentials in the 
sum p.2[) . unnecessarily doubling the number of unknown parameters to be searched for by Prony methods. 

B. Modified least-squares Prony, Kumaresan- Tufts and matrix pencil 

A Fortran routine implementing modified least-squares Prony was kindly provided to us by Stanley Lawrence Marple 
Jr. [25| . We tested the routine's performance on noiseless "pure ringdown" waveforms obtained by superposing three 
damped exponentials with frequencies given by the first three QNMs (n — 0, 1, 2) of a Schwarzschild black hole [l7l |. 
For noiseless waveforms, the frequency and damping time of the fundamental mode (n = 0) are typically determined 
with accuracies ~ 10 -6 . The frequency and damping time of the two overtones usually have accuracies ~ 10 -5 . 
Parameter estimation becomes much worse, as expected from the theory, when we add even a modest amount of noise 
to the waveforms. 

Performance in the presence of noise gets much better when we use two MATLAB routines by El-Hadi Djermoune 
24], implementing the Kumaresan- Tufts and matrix pencil methods. These routines require the specification of the 
prediction order (that we always choose to be L = N/3, where N is the number of data points, in order to minimize 
the variance in parameter estimation). In addition, they also require the specification of the number of complex 
exponentials to be searched for. Djermoune kindly provided us with an additional routine to estimate the number of 
complex exponentials present in the signal, or (in more technical jargon) the "order of the exponential model" . The 
estimation routine is based on two criteria: Akaike's information criterion (AIC) and the minimum description length 
(MDL) criterion [illil . 

When tested on noiseless waveforms, Djermoune's routines are remarkably successful at estimating the number of 
modes present in the signal. Frequencies and damping times for noiseless signals are typically determined to accuracies 
~ 10~ 15 (comparable to machine precision). 

We will see in the following that MP and KT methods have essentially the same performance. In fact, in the next 
Section we will show by numerical experiments that MP methods have slightly smaller variance and bias in parameter 
estimation: this was pointed out also by Djermoune and collaborators (see eg. @, SH H^] ) . Due to the similarity of 
the two techniques, and due to the slightly superior performance of MP over KT, we will usually compare MP and 
LM methods in our analysis of merger waveforms (Section I VII below) . 

As a final remark, we point out that MP and KT have a variance in frequency estimation which is only ~ 9/8 = 1.125 
times larger than the Cramer-Rao bound. These algorithms may prove extremely useful not only to study the merger 
waveforms produced by numerical relativity (as we do here), but also to estimate the source parameters after GW 
detection. MP and KT algorithms are most effective and useful when we deal with noisy waveforms, and these 
situations are often the most interesting. For example, fitting the late-time, low-amplitude portion of a numerical 
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relativity waveform yields the remnant black hole's parameters. Fitting large-? multipolar components (that carry 
little energy, hence are more affected by numerical noise) is necessary for tests of the general relativistic no-hair 
theorem [17]. Last but not least, the importance of parameter estimation from experimental, noisy GW data can 
hardly be underestimated. 

V. COMPARISON OF DIFFERENT METHODS ON NOISY DAMPED SINUSOIDS 

To test the performance of different fitting methods, we built model waveforms reproducing the essential features 
of the ringdown waveforms produced by binary black hole merger simulations (see Fig. [IJ . For simplicity, and to take 
into account the limitations of the LM algorithm, here we consider our "true" signal as consisting of a single damped 
sinusoid. We take the frequency and damping time to be those of a the fundamental (n = 0) I = m = 2 perturbations 
of a Kerr black hole, as listed in [ijj. For a given oscillation frequency to, we produce a signal lasting five GW cycles: 
i.e. the signal length t^ n = 5(2ir/uj) = 5Tqw- The sampling time is taken to be T = Tqw/50, so that the "full" 
waveform consists of 250 data points. 

To this "true" signal we superimpose Gaussian white noise. In this Section we do not consider standard least-squares 
Prony, since it is outperformed by MP and KT algorithms in the presence of noise. The MP and KT algorithms are 
implemented in MATLAB. In this case, we produce Gaussian noise using the built-in function normrnd. For our 
Fortran implementation of the LM algorithm we gene rate noise as a random variable va with mean (i — and 
standard deviation a using the Box-Miiller method [50] : for each t, given two random numbers U\ and u 2 uniformly 
distributed in [0, 1] (as generated with the Numerical Recipes routine Iran2l p5l|), we add to our signal 

w{t) = /i + <7\J —2 ln(iti) cos(27ru 2 ) . (5-1) 




FIG. 3: QNM waveforms with and without Gaussian white noise. We normalize the time axis to the total duration of the 
signal tfl n . The "pure" noiseless waveform (thick black line) has unit amplitude. Dashed (red), dotted (blue) and thin (green) 
lines are the same waveform superimposed to Gaussian white noise with a — 10~ 4 , 10~ 3 , 10~ 2 , respectively. 

For illustration, in Fig. [3] we show a typical waveform generated in this way. The "true" signal (with a — 0) corre- 
sponds to a perturbed Schwarzschild black hole, i.e. the frequency and damping time correspond to the fundamental 
I = m = 2 QNM for a black hole of dimensionless spin parameter j = [ItJ ■ For any given a, the addition of Gaussian 
white noise has a less severe impact for large values of j. The reason is that the damping time of the I = m = 2 mode 
grows with the rotation parameter j (see eg. Table II in [17|). A Schwarzschild waveform with a = 10~ 3 is quite 
similar to the typical merger waveforms of Fig. [TJ and for this reason we will use it to test the performance of the 
different fitting routines. Results for different values of a are qualitatively similar. 

For each fitting algorithm, we perform Monte Carlo simulations to produce a large number (iV no i se = 100) of 
realizations of the noise with a = 10~ 3 . For each realization of the noise (y — 1, . . . , A noisc ) we fit the waveform 
varying the starting time from to = to to = ifm- Each fit yields a couple of values Wi,(£o), ot v (to), and from these we 
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can deduce the quality factor of the oscillation Q v (to) = \u> u (ta) /\2a v (to)\\ [l7|. For each to we compute the standard 
deviation and bias of each of these fitted quantities x (= u>, a or Q) from our Monte Carlo simulations: 




where xtvuc is the known, true value of each quantity and a bar denotes the average over all noise realizations: 



Results of these Monte Carlo simulations are shown in Fig. [U There we see that all three methods (KT, MP and 
LM) are essentally equivalent in terms of variance 7 . Standard deviation and bias grow sharply for ioAfin ^ 0.7, when 
we are fitting that part of the signal which is buried in noise (compare Fig. [3]). The bias is usually very small, but 
remarkably linear methods (KT, and especially MP) beat the non-linear LM fitting routine in terms of bias on the 
frequency. In terms of bias on the damping time, LM performs slightly better than MP and KT only for large values 
of to , when the SNR is very small. 

When comparing the performance of the different methods it is useful to remember that LM only works if we give a 
good initial guess for the parameters we want to estimate, whereas MP and KT automatically find the values of these 
parameters, with no need for initial guesses. Our comparison is somehow "biased" in favour of the LM method in at 
least three ways: (i) we choose optimal initial guesses for the parameters, so that the LM algorithm is not "allowed" 
to converge to some wrong root; (ii) we choose a real signal rather than a complex signal, because the LM algorithm 
only works for real data sets; (iii) we only consider one damped exponential (LM often fails to converge if we include 
additional damped sinusoids, whereas Prony-type methods are still remarkably successful). 



Here we turn to the problem that motivates the present analysis of fitting algorithms for complex exponentials in 
noise. If the ringdown radiation emitted as a result of a binary merger shows no signs of non-linearities or mode 
coupling, our fits should have a simple behavior: as we increase the starting time for the fit to, the black hole's 
oscillation frequencies coimn and damping factors ai mn should asymptotically approach a constant, whose value is 
predicted by linear perturbation theory [171 ]. 

In Fig. [5]we show results of QNM fits performed on the I = m = 2 component of low-resolution and high-resolution 
merger simulations with mass ratio q = 1.5. Solid lines refer to the "absolute" truncation criterion of Section HH and 
dashed lines to the "relative" truncation criterion. Different truncation criteria affect the estimated parameters only 
for low-resolution simulations, and at late starting times (to/M > 290). Results obtained fitting by a single complex 
exponential with standard least-squares Prony 8 are slightly off from the "best" fitting methods (MP, KT and LM). 
Not surprisingly, there is remarkable agreement between KT and MP methods, and very good agreement between 
these two methods and the non- linear least-squares LM fit. 

The main conclusion to be drawn from Fig. [5] is that all fitting routines consistently predict that frequencies and 
damping times have small (but non-trivial in structure) time variations. Relative variations in u) are of order ~ 0.5%, 
with the frequency reaching a local maximum at to/M ~ 272. Relative variations in a are slightly larger (roughly 
~ 2.5%), with \a\ attaining a maximum (and the damping time correspondingly attaining a minimum) for to/M ~ 280. 
Notice also that increasing the resolution sensibly reduces the irregularities in the predicted frequency at all times, 
and produces a flattening of the estimated parameters for 280 < to/M < 300. The decrease in u> and in \a\ for 
280 < to/M < 300, that are visible in the left panels, are clearly an artifact of insufficient resolution. It would be 
interesting to perform Richardson extrapolation of the numerical results to determine if oscillations in the estimated 



7 We also tested the MP and KT methods on pure sinusoids, finding good agreement with theoretical predictions for the frequency 
variance. 

8 Figs. [5] and [6] only display results for least-squares Prony obtained using the "relative" truncation criterion. However, we did check 
that using the "absolute" truncation criterion has a very small effect on the estimated parameters. We also checked that increasing the 
number of exponentials we search for does not improve the agreement of standard least-squares Prony with other methods. 




(5.4) 



VI. APPLICATION TO MERGER WAVEFORMS 
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FIG. 4: Standard deviation (left) and bias (right) in the estimate of frequency, damping time and quality factor (top to bottom). 
All quantities are given as functions of the starting time of the fit to (normalized by the duration of the signal tan); each point 
is the result of a Monte Carlo simulation obtained by adding iVnoise = 100 realizations of Gaussian white noise with zero mean 
and a = 10~ 3 to the j = waveform of Fig. [3] Solid (black), dashed (red) and dotted (blue) lines refer to the MP, KT and 
LM algorithms, respectively. 
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FIG. 5: Performance of different Prony methods in estimating the oscillation frequency u) (top) and damping factor a (bottom) 
for low and high-resolution runs (left and right, respectively). For concreteness we choose a binary with q = 1.5 and consider 
the fundamental mode with I = m = 2. Prony-type results refer to the complex waveform, while LM results are obtained by 
fitting the real part of the waveform only. 



parameters (which could be a sign of "new" physics) disappear in the limit of infinite resolution. We plan to address 
this problem in the near future. An analysis of the fine structure of the signal, and of its implications for gravitational 
wave phenomenology, will be presented in a forthcoming publication [3fj| . 

For the time being, we simply point out that such systematic variations in the oscillation frequency and/or damping 
time could be signs of non-linearities and/or mode coupling in the numerical simulations. Variations could be due 
to the black hole's mass and angular momentum changing on a timescale which is longer than the QNM oscillation 
period, or to beating phenomena with other QNM frequencies. In our opinion, the fact that all fitting methods 
consistently predict the same "global" structure is convincing evidence that the existence of maxima and minima is 
not due to numerical errors in the fit. 

In Fig. [S] we plot the quality factor of the oscillations as a function of to/M. The time variation of Q is basically 
dominated by the time variation of a. This is quite obvious, since Q = u>/(2a) and time variations in a are larger 
than time variations in u>. The interest of plotting Q(to/M) comes from the fact that, in linear theory and for / = m, 
Q is a monotonic function of j 17]. Roughly speaking, large variations in Q for the fundamental oscillation mode 
could mean that the Kerr angular momentum parameter is changing in time, or that there are significant corrections 
to the linear approximation. 
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Finally, in Fig. [7] we compare the performance ol our two "best" fitting methods (MP and LM) in estimating 
oscillation frequencies and quality factors for different mass ratios: q — 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0. Deviations in 
Q between the two methods are quite significant, especially for large mass ratio, when the numerical simulations are 
less reliable. For q = 4.0 and low resolution, the LM method shows larger variations in u> than the MP method. This 
could be an effect of the larger bias in frequency of the LM method, as compared with the MP method (cf. Fig. [4|. 
Once again, increasing the resolution produces a flattening of all curves, the effect being more pronounced for large 
mass ratios. A more systematic analysis of simulations for different mass ratios will appear elsewhere (30j . 



VII. CONCLUSIONS AND OUTLOOK 



Our understanding of binary black hole mergers has recently been revolutionized by the success of numerical 
relativity simulations. A partial disappointment came from the remarkable simplicity of the inspiral-merger-ringdown 
transition: the non-linearities of Einstein's theory do not seem to leave spectacular imprints on the merger, that seems 
to be a very short phase smoothly connecting the familiar "inspiral" and "ringdown" waveforms jl2| . 

In this paper we argued that non-linearities could show up in the fine structure of the ringdown waveform, as 
systematic time variations of the ringdown frequencies and damping times. We also showed by explicit calculations 
that such time variations are actually present in numerical waveforms (see Figs. O [5] and [7]) ■ The variations we are 
looking for are typically ~ 100 times smaller than the QNM frequencies themselves. This smallness calls for accurate 
parameter estimation methods to extract ringdown frequencies from numerical simulations. We considered a class of 
well-studied and robust parameter estimation methods for complex exponentials in noise, which are modern variations 
of a linear parameter estimation technique first introduced in 1795 by Prony. 

The comparison of different fitting methods can help resolve actual physical effects from systematic parameter 
estimation errors, due to the variance and bias of each particular fitting algorithm. For this reason we compared two 
variants of the original Prony algorithm (the Kumaresan- Tufts and matrix pencil methods [28t l.29j]) against standard 
non-linear least-squares techniques, such as the Levenberg-Marquardt algorithm. We found that the two classes of 
methods have comparable variance, but Prony-type methods tend to have slightly smaller bias. Prony methods have 
a number of advantages with respect to standard non-linear least-squares techniques: 

1) They do not require an initial guess of the fitting parameters. 

2) They provide us with a simple, efficient way to estimate QNM frequencies for the overtones, and even to estimate 
how many overtones are present in the signal [47], . 

3) They are explicitly designed to deal with complex signals, so they should be most useful for "generic" waveforms, 
such as those produced by spinning, precessing black hole binaries. In the case of non-trivial polarization of the 
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FIG. 7: Oscillation frequency (top) and quality factor (bottom) for modes with I — m — 2 and for low and high-resolution runs 
(left and right, respectively), as a function of the starting time of the fit. Thick lines use MP, thin lines use LM. Values of the 
quality factor for t/M < 250 are often unphysically large. We truncate the waveform at late times when the absolute value 
of the amplitude drops below 10 -4 (for low resolution) or 10 -5 (for high resolution). In each panel, lines from top to bottom 
refer to q = 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0. 



gravitational radiation it might become crucial to fit simultaneously the real and imaginary parts of the signal 
to exploit optimally the information carried by both degrees of freedom. 

4) Statistical properties of Prony-based methods in the presence of noise (such as their variance and bias) are well 
studied and under control. Our Monte Carlo simulations suggest that the statistical properties of Prony-like 
methods as we vary the SNR are more consistent than for non-linear least-squares methods, even when we 
consider a single damped sinusoid (the dotted blue lines in Fig. 3] have a very irregular behavior). 

The methods introduced in this paper should be useful both theoretically and experimentally. From a theoretical 
standpoint, besides helping in the search for non-linearities, Prony methods can also be used as "diagnostic tools" for 
numerical simulations. For example, monitoring the time variation of ringdown parameters from different multipolar 
components of the radiation we can check that the end-product of a merger is indeed consistent with the Kerr solution. 
An analysis of presently available numerical waveforms based on Prony methods is ongoing [30| . 

So far, Prony algorithms have generally been overlooked by the GW data analysis community. We believe that the 
techniques presented in this paper should prove useful to extract science from noisy ringdown signals in the (hopefully 
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not too distant) future, when GW detection will finally become a reality. 
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